Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks
Recent research suggests that the feed-forward module within Transformers can
be viewed as a collection of key-value memories, where the keys learn to
capture specific patterns from the input based on the training examples. The
values then combine the output from the 'memories' of the keys to generate
predictions about the next token. This leads to an incremental process of
prediction that gradually converges towards the final token choice near the
output layers. This interesting perspective raises questions about how
multilingual models might leverage this mechanism. Specifically, for
autoregressive models trained on two or more languages, do all neurons (across
layers) respond equally to all languages? No! Our hypothesis centers around the
notion that during pretraining, certain model parameters learn strong
language-specific features, while others learn more language-agnostic (shared
across languages) features. To validate this, we conduct experiments utilizing
parallel corpora of two languages that the model was initially pretrained on.
Our findings reveal that the layers closest to the network's input or output
tend to exhibit more language-specific behaviour compared to the layers in the
middle.
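To make the key-value-memory reading of the feed-forward block concrete, the following sketch computes one token's FFN output as a weighted sum of value vectors, with the weights obtained by matching the input against the keys. The dimensions, random weights and ReLU gating are illustrative assumptions, not the configuration of any particular pretrained model.

```python
import numpy as np

# Minimal sketch of the "FFN as key-value memory" view for a single token.
d_model, d_ff = 8, 32                        # hidden size, number of "memories"
rng = np.random.default_rng(0)

x = rng.normal(size=d_model)                 # residual-stream input for one token
W_keys = rng.normal(size=(d_ff, d_model))    # each row: a key (pattern detector)
W_values = rng.normal(size=(d_ff, d_model))  # each row: a value (output direction)

# Key matching: how strongly each memory's pattern fires on this input.
coefficients = np.maximum(W_keys @ x, 0.0)   # ReLU-gated memory activations

# Value mixing: the FFN output is a coefficient-weighted sum of value vectors,
# added back to the residual stream, nudging the next-token distribution.
ffn_output = coefficients @ W_values

# A language-specific neuron would show coefficients that are consistently high
# for inputs in one language and near zero for the other.
print(ffn_output.shape)                      # (8,)
```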
HUME: Human UCCA-Based Evaluation of Machine Translation
Human evaluation of machine translation normally uses sentence-level measures
such as relative ranking or adequacy scales. However, these provide no insight
into possible errors, and do not scale well with sentence length. We argue for
a semantics-based evaluation, which captures what meaning components are
retained in the MT output, thus providing a more fine-grained analysis of
translation quality, and enabling the construction and tuning of
semantics-based MT. We present a novel human semantic evaluation measure, Human
UCCA-based MT Evaluation (HUME), building on the UCCA semantic representation
scheme. HUME covers a wider range of semantic phenomena than previous methods
and does not rely on semantic annotation of the potentially garbled MT output.
We experiment with four language pairs, demonstrating HUME's broad
applicability, and report good inter-annotator agreement rates and correlation
with human adequacy scores.
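As an illustration of how per-unit semantic judgements of this kind can be aggregated into a sentence-level score, the sketch below computes the fraction of source semantic units judged as preserved in the MT output. The unit names, label set and equal-weight aggregation are assumptions made for the example; HUME's exact annotation and scoring protocol are defined in the paper.

```python
from collections import Counter

# Hypothetical per-unit annotations for one sentence: each UCCA unit of the
# source is judged on whether its meaning survives in the MT output.
annotations = {
    "scene_1":        "preserved",
    "scene_1/A_dog":  "preserved",
    "scene_1/barked": "garbled",
    "scene_2":        "preserved",
    "scene_2/loudly": "omitted",
}

def hume_like_score(unit_labels: dict[str, str]) -> float:
    """Fraction of semantic units whose meaning is retained in the MT output."""
    counts = Counter(unit_labels.values())
    return counts["preserved"] / len(unit_labels)

print(f"{hume_like_score(annotations):.2f}")  # 0.60
```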
Results of the WMT15 Metrics Shared Task
This paper presents the results of the WMT15 Metrics Shared Task. We asked
participants of this task to score the outputs of the MT systems involved in
the WMT15 Shared Translation Task. We collected scores of 46 metrics from 11
research groups. In addition to that, we computed scores of 7 standard metrics
(BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were
evaluated in terms of system level correlation (how well each metric's scores
correlate with WMT15 official manual ranking of systems) and in terms of segment
level correlation (how often a metric agrees with humans in comparing two
translations of a particular sentence).
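The two notions of correlation can be sketched as follows, with made-up numbers standing in for the real WMT15 systems and judgements: Pearson's r over per-system scores at the system level, and a Kendall-tau-style count of concordant versus discordant pairwise comparisons at the segment level.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative numbers only; the real evaluation uses the WMT15 submissions
# and the official human judgements.
human_system_scores  = np.array([0.62, 0.55, 0.48, 0.41])  # one value per MT system
metric_system_scores = np.array([28.1, 26.4, 25.9, 22.0])  # e.g. BLEU per system

# System-level correlation: does the metric rank whole systems like humans do?
r, _ = pearsonr(metric_system_scores, human_system_scores)
print(f"system-level Pearson r = {r:.3f}")

# Segment-level agreement (Kendall-tau-style): over pairs of translations of the
# same source sentence, count how often the metric agrees with the human
# preference. The pairs below are made up for illustration; ties are ignored.
pairs = [  # (metric score for translation A, for translation B, human-preferred side)
    (0.41, 0.35, "A"),
    (0.22, 0.30, "B"),
    (0.55, 0.60, "A"),
]
concordant = sum((a > b) == (winner == "A") for a, b, winner in pairs)
discordant = len(pairs) - concordant
tau = (concordant - discordant) / len(pairs)
print(f"segment-level tau = {tau:.3f}")
```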
The WMT'18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English
Ten Years of WMT Evaluation Campaigns: Lessons Learnt
The WMT evaluation campaign (http://www.statmt.org/wmt16) has been run annually since 2006. It is a collection of shared
tasks related to machine translation, in which researchers compare their techniques against those of others in the field. The longest
running task in the campaign is the translation task, where participants translate a common test set with their MT systems. In addition
to the translation task, we have also included shared tasks on evaluation: both on automatic metrics (since 2008), which compare the
reference to the MT system output, and on quality estimation (since 2012), where system output is evaluated without a reference. An
important component of WMT has always been the manual evaluation, wherein human annotators are used to produce the official ranking
of the systems in each translation task. This reflects the belief of the WMT organizers that human judgement should be the ultimate arbiter
of MT quality. Over the years, we have experimented with different methods of improving the reliability, efficiency and discriminatory
power of these judgements. In this paper we report on our experiences in running this evaluation campaign, the current state of the art in
MT evaluation (both human and automatic), and our plans for future editions of WMT.
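As a schematic of how pairwise human judgements become a system ranking, the sketch below aggregates made-up preference judgements by simple win ratio. WMT itself has used more elaborate aggregation methods (e.g. expected wins and TrueSkill) in different years, and the system names and data here are invented, so this is illustrative only.

```python
from collections import defaultdict

# Each judgement records which of two systems a human annotator preferred
# for one sentence. The data is fabricated for the example.
judgements = [  # (better system, worse system)
    ("online-B", "uedin-syntax"),
    ("uedin-syntax", "online-A"),
    ("online-B", "online-A"),
    ("uedin-syntax", "online-B"),
]

wins, totals = defaultdict(int), defaultdict(int)
for better, worse in judgements:
    wins[better] += 1
    totals[better] += 1
    totals[worse] += 1

# Rank systems by the fraction of comparisons they won.
ranking = sorted(totals, key=lambda s: wins[s] / totals[s], reverse=True)
print(ranking)
```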
Edinburgh's Statistical Machine Translation Systems for WMT16
This paper describes the University of Edinburgh’s phrase-based and syntax-based
submissions to the shared translation tasks of the ACL 2016 First Conference on
Machine Translation (WMT16). We submitted five phrase-based and five
syntax-based systems for the news task, plus one phrase-based system for the
biomedical task.
Moses: Open Source Toolkit for Statistical Machine Translation
We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) efficient data formats for translation models and language models. In addition to the SMT decoder, the toolkit also includes a wide variety of tools for training, tuning and applying the system to many translation tasks.
Findings of the WMT 2017 Biomedical Translation Shared Task
Automatic translation of documents is an important task in many domains, including the biological and clinical domains. The second edition of the Biomedical Translation task in the Conference on Machine Translation focused on the automatic translation of biomedical-related documents between English and various European languages. This year, we addressed ten languages: Czech, German, English, French, Hungarian, Polish, Portuguese, Spanish, Romanian and Swedish. Test sets included both scientific publications (from the Scielo and EDP Sciences databases) and health-related news (from the Cochrane and UK National Health Service web sites). Seven teams participated in the task, submitting a total of 82 runs. Herein we describe the test sets, participating systems and results of both the automatic and manual evaluation of the translations.